Yeah, so it's a great pleasure to be here. I don't consider myself really in the field of machine learning, but I have a lot of colleagues in the field, so once in a while I bump into them and have some conversations, which eventually led to some of the work here. There are two parts: the first is a result about graph neural networks, and the second part is a fast
algorithm that we constructed to compute a very particular metric, the Wasserstein-1 distance.
The first part is joint work with a group of colleagues at my institute, in particular Yu Guang Wang, and the second with a group at Tsinghua University, in particular Professor Wu Hao.
All right, so in many machine learning problems you run into data that has a certain graph structure, so there you are talking about a graph with nodes and edges. The standard data, as we call it, has a Euclidean structure, usually pixels on regular grids, but there are many applications where you have to deal with a non-Euclidean structure. Here we can see the so-called node classification problem for graph neural networks: we assume we are given a graph, and the network has to label the nodes using the edge information. There are a lot of applications where this kind of thing is important, for example in drug design and drug repurposing, knowledge graphs, recommendation systems in e-commerce, movie reviews, book reviews, and even public opinion analysis, and so on.
So here is the graph neural network. We have this graph, assume for now that it is an undirected graph, so you have edges and nodes, and you want a network which reflects this geometric structure of the graph. Here is the typical graph neural network layer: you start with the l-th layer and you get to the next layer, denoted by the superscript l+1. Here A is the adjacency matrix, which encodes the connectivity between the different nodes, that is, the edge information, and I is the identity matrix, so you work with A + I; the D matrix is the corresponding degree matrix used for normalization, and the W matrices are the layer-specific trainable weight matrices. H is the matrix of activations in each layer, and for the activation function you can use a standard one, for example ReLU.
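To fix notation, the layer update being described is roughly the standard graph convolution rule (a sketch of the form used in the literature; the exact normalization on the slide may differ):

\[
H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} (A + I)\, \tilde{D}^{-1/2}\, H^{(l)} W^{(l)} \right), \qquad \tilde{D}_{ii} = \sum_j (A + I)_{ij},
\]

where H^{(l)} collects the node features at layer l, W^{(l)} is the trainable weight matrix of that layer, and \sigma is the activation function.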
One can write such a graph neural network as a so-called message-passing system: going from layer l to layer l+1, you pass a message from one layer to the next. So suppose you are at layer k-1 and you have the node data x_j; through some aggregation and activation you move to the next layer. Here you have an operator corresponding to a so-called node-permutation-invariant function, for example a sum, a mean, a maximum, and so on; then you have another activation function, which we assume is differentiable, and you pass to the next layer.
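As a rough illustration of this message-passing scheme (the function name, the sum aggregation, and the ReLU update are my own choices for the sketch, not taken from the talk):

```python
import numpy as np

def message_passing_layer(X, A, W, aggregate=np.add.reduce):
    """One generic message-passing step.

    X : (n, d) node features at layer k-1
    A : (n, n) adjacency matrix of an undirected graph
    W : (d, d2) weight matrix for this layer
    aggregate : permutation-invariant reduction over the neighbor
                messages (sum here; mean or max would also qualify)
    """
    n, d = X.shape
    X_next = np.zeros((n, W.shape[1]))
    for i in range(n):
        neighbors = np.nonzero(A[i])[0]            # nodes j connected to node i
        if len(neighbors) == 0:
            aggregated = np.zeros(d)               # isolated node: empty message
        else:
            aggregated = aggregate(X[neighbors])   # combine neighbor features
        X_next[i] = np.maximum(aggregated @ W, 0.0)  # linear map + ReLU update
    return X_next

# Example: one layer applied to a small random undirected graph
n, d = 5, 3
A = (np.random.rand(n, n) > 0.5).astype(float)
A = np.triu(A, 1); A = A + A.T                     # symmetric, no self-loops
X = np.random.randn(n, d)
X = message_passing_layer(X, A, np.random.randn(d, d))
```

Any other permutation-invariant aggregation (mean, max) could be passed in place of the sum without changing the structure of the step.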
There are some standard ways to build a graph neural network; here are two standard examples. The graph convolutional network is over here: j runs over all the nodes connected to node i, you sum over all these nodes, and the weight matrix W is what is trained. There is also the graph attention network, where you have a matrix of attention coefficients a_ij; this is somewhat similar to what we heard yesterday about transformers. Here a LeakyReLU is used: the ReLU is x when x is positive and zero when x is negative, and here, instead of zero, you have a non-zero linear function when x is negative. The reason to do this is to avoid vanishing gradients in gradient descent: if you have a zero derivative you get stuck there, so you try to avoid that; that's the only purpose of this.
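For comparison, the two node-wise update rules being contrasted look roughly as follows (standard forms from the GCN and GAT literature; the talk's notation may differ slightly):

\[
\text{GCN:}\quad h_i^{(l+1)} = \sigma\Big( \sum_{j \in \mathcal{N}(i)} \frac{1}{c_{ij}}\, W^{(l)} h_j^{(l)} \Big), \qquad
\text{GAT:}\quad h_i^{(l+1)} = \sigma\Big( \sum_{j \in \mathcal{N}(i)} a_{ij}\, W^{(l)} h_j^{(l)} \Big),
\]

where c_{ij} is a fixed degree-based normalization, while the attention coefficients a_{ij} are learned, typically as a softmax over the neighbors j of \mathrm{LeakyReLU}(\mathbf{a}^\top [W h_i \,\|\, W h_j]), with

\[
\mathrm{LeakyReLU}(x) = \begin{cases} x, & x \ge 0,\\ \alpha x, & x < 0, \end{cases} \qquad 0 < \alpha \ll 1.
\]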
Then one of the bottlenecks of graph neural networks is the over-smoothing issue, which you also heard about when we just talked about transformers. You have all this data, and eventually all of it forms some kind of consensus: if you consider the neurons as particles, all these particles go to the same point. For example, with face recognition (Shi Jin, Professor Wei, and so on), in the end you cannot distinguish the different pictures; that is what is called over-smoothing. In particular, this happens provided you use deep layers: with a very shallow network there is usually no problem, but with deep layers you get this over-smoothing issue in a typical graph neural network. How do you characterize this over-smoothing? A good way to characterize it is to use the Dirichlet energy. For this Dirichlet energy, essentially, you have all these points, which you can consider as particles, and you sum the distances between them, weighted by the matrix a_ij, which is assumed positive and suitably normalized. So the over-smoothing
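One common form of this Dirichlet energy (a sketch; the exact normalization used in the talk may differ) is

\[
E\big(H^{(l)}\big) = \frac{1}{2} \sum_{i,j} a_{ij} \left\| \frac{h_i^{(l)}}{\sqrt{1+d_i}} - \frac{h_j^{(l)}}{\sqrt{1+d_j}} \right\|^2,
\]

where h_i^{(l)} is the feature of node i at layer l, a_{ij} are the non-negative adjacency weights, and d_i is the degree of node i; in this language, over-smoothing corresponds to this energy decaying toward zero as the depth grows.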